

Doubly Convolutional Neural Networks

Neural Information Processing Systems

In this paper, we propose doubly convolutional neural networks (DCNNs), which significantly improve the performance of CNNs by further exploring this idea. Instead of allocating a set of convolutional filters that are learned independently, a DCNN maintains groups of filters in which the filters within each group are translated versions of each other. Practically, a DCNN can be easily implemented by a two-step convolution procedure, which is supported by most modern deep learning libraries. We perform extensive experiments on three image classification benchmarks: CIFAR-10, CIFAR-100, and ImageNet, and show that DCNNs consistently outperform competing architectures. We have also verified that replacing a convolutional layer with a doubly convolutional layer at any depth of a CNN can improve its performance. Moreover, we demonstrate various design choices of DCNNs, showing that a DCNN can serve the dual purpose of building more accurate models and reducing the memory footprint without sacrificing accuracy.
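The core idea of parameterizing a group of tied filters as translated crops of one shared "meta" filter can be sketched as follows. This is a hypothetical minimal NumPy illustration, not the authors' implementation: the function names (`extract_filter_group`, `conv2d_valid`) and the specific sizes are assumptions chosen for clarity.

```python
import numpy as np

def extract_filter_group(meta_filter, k):
    """Extract all k x k translated sub-filters from a z x z meta filter.

    This mirrors the DCNN idea: the filters within a group are translated
    versions of one another, all carved from a single shared meta filter,
    so one learned z x z tensor parameterizes (z-k+1)**2 tied k x k filters.
    """
    z = meta_filter.shape[0]
    group = []
    for i in range(z - k + 1):
        for j in range(z - k + 1):
            group.append(meta_filter[i:i + k, j:j + k])
    return np.stack(group)  # shape: ((z-k+1)**2, k, k)

def conv2d_valid(image, filt):
    """Plain 'valid'-mode 2-D cross-correlation with a single filter."""
    h, w = image.shape
    k = filt.shape[0]
    out = np.empty((h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * filt)
    return out

# A 4x4 meta filter yields a group of nine 2x2 translated filters;
# applying the whole group is the "two-step convolution" in spirit:
# step 1 extracts the translated filters, step 2 convolves with each.
rng = np.random.default_rng(0)
meta = rng.standard_normal((4, 4))
group = extract_filter_group(meta, 2)
image = rng.standard_normal((8, 8))
maps = np.stack([conv2d_valid(image, f) for f in group])
print(group.shape, maps.shape)  # (9, 2, 2) (9, 7, 7)
```

In a real network the meta filter would be a learned parameter, and the extraction step can be expressed as a convolution itself (e.g. via an unfold/im2col operation), which is why the layer is implementable with standard deep learning primitives.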


Reviews: Doubly Convolutional Neural Networks

Neural Information Processing Systems

Added after rebuttal / discussion: Although I still contend that the CIFAR-10 baseline network used in the paper is unnecessarily weak, I appreciate that results on a large-scale dataset such as ImageNet (on which the method is shown to be quite competitive) are much more relevant, and I no longer consider it a major issue. I still disagree with the presentation of the idea (asserting that the filters are "translated versions of each other", when the reason the method works is precisely because they are _not_ exact translated versions of each other, only approximately so), but I guess that can be put down to a matter of taste. That leaves the problem with the CyclicCNN baseline, which I maintain is unnecessarily crippled by removing its ability to use relative orientation information. My issue was not with the fact that this is not explained in enough detail (as the rebuttal seems to imply), but rather that the model is wrong. This form of parameter sharing is pointless, except in very rare cases where relative orientation information is not relevant to the task at hand (I can't think of any situations where this is true, but there might be some).


Doubly Convolutional Neural Networks

Zhai, Shuangfei, Cheng, Yu, Zhang, Zhongfei (Mark), Lu, Weining

Neural Information Processing Systems
